CONCENTRATION OF THE INFORMATION IN DATA WITH
LOG-CONCAVE DISTRIBUTIONS 

By Sergey Bobkov and Mokshay Madiman

University of Minnesota and Yale University 

A concentration property of the functional $-\log f(X)$ is demonstrated, when a random vector $X$ has a log-concave density $f$ on $\mathbb{R}^n$. This concentration property implies in particular an extension of the Shannon-McMillan-Breiman strong ergodic theorem to the class of discrete-time stochastic processes with log-concave marginals.

1. Introduction. Let $(\Omega, \mathcal{B}, \mathbb{P})$ be a probability space and let $X = (X_1, \ldots, X_n)$ be a random vector defined on it with each $X_i$ taking values in $\mathbb{R}$. Suppose that the joint distribution of $X$ has a density $f$ with respect to a reference measure $\nu(dx)$ on $\mathbb{R}^n$. For most of this paper (except for the purposes of discussion in this section), the reference measure is simply Lebesgue measure $dx$ on $\mathbb{R}^n$. The random variable

$$ \tilde h(X) = -\log f(X) $$

may be thought of as the (random) information content of $X$. Such an interpretation is well-justified in the discrete case, when $\nu$ is the counting measure on some countable subset of $\mathbb{R}^n$ on which the distribution of $X$ is supported. In this case, $\tilde h(X)$ is essentially the number of bits needed to represent $X$ by a coding scheme that minimizes average code length [21]. In the continuous case (with reference measure $dx$), one may still call $\tilde h(X)$ the information content even though the coding interpretation no longer holds. In statistics, one may think of the information content as the log likelihood function.

The average value of the information content of $X$ is known more commonly as the entropy. Indeed, the entropy of $X$ is defined by

$$ h(X) = -\int f(x)\log f(x)\,dx = -\mathbb{E}\log f(X). $$

Observe that we adopt here the usual abuse of notation: we write $h(X)$ even though the entropy is a functional depending only on the distribution of $X$ and not on the value of $X$. In general, $h(X)$ may or may not exist (in the Lebesgue sense); if it does, it takes values in the extended real line $[-\infty, +\infty]$.

Because of the relevance of the information content in various areas such as information theory, probability and statistics, it is intrinsically interesting to understand its behavior. In particular, a natural question arises: is it true that the information content concentrates around the entropy in high dimension? In general, there is no reason for such a concentration property to hold. A main purpose of this note is, however, to show that when the probability measure on $\mathbb{R}^n$ of interest is absolutely continuous and log-concave, $\log f(X)$ does possess a powerful concentration property. Specifically, we prove the following theorem.

Theorem 1.1. Suppose $X = (X_1, \ldots, X_n)$ is distributed according to a log-concave density $f$ on $\mathbb{R}^n$. Then, for all $t > 0$,

$$ \mathbb{P}\{|\tilde h(X) - h(X)| \geq t\sqrt{n}\} \leq 2e^{-ct}, $$

where $c > 0$ is a universal constant. In fact, one may take $c = 1/16$.

Note that under the assumption of log-concavity and absolute continuity, $h(X)$ always exists and is finite (see, e.g., [6]).

Let us emphasize that the distribution of the difference $\tilde h(X) - h(X)$ is stable under all affine transformations of the space, that is,

$$ \tilde h(TX) - h(TX) = \tilde h(X) - h(X) $$

for all invertible affine maps $T: \mathbb{R}^n \to \mathbb{R}^n$. In particular, the variance of the information content

$$ \mathbb{E}|\tilde h(X) - h(X)|^2 $$

represents an affine invariant. By Theorem 1.1, when $f$ is log-concave, this variance is bounded by $Cn$ with some universal constant $C$.

In fact, the deviation inequality in Theorem 1.1 amounts to a stronger bound $\|\tilde h(X) - h(X)\|_{\psi_1} \leq C\sqrt{n}$ with respect to the Orlicz norm generated by the Young function $\psi_1(t) = e^{|t|} - 1$. This is consistent with the observation that in many standard examples $\tilde h(X)$ behaves like the sum of $n$ independent random variables. For example, when $X$ is standard normal, we have

$$ \tilde h(X) - h(X) = \sum_{i=1}^n \frac{X_i^2 - 1}{2}. $$

More generally, if $X = (X_1, \ldots, X_n)$ has independent components, then

$$ \tilde h(X) - h(X) = \sum_{i=1}^n \big[\tilde h(X_i) - h(X_i)\big]. $$

These examples show that the $\sqrt{n}$-normalization in Theorem 1.1 is chosen correctly and cannot be improved for the class of log-concave distributions.
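The standard normal example can be checked directly. The following is a minimal sketch, not part of the original argument: it evaluates $-\log f(x) - h(X)$ from the definitions and compares it with the sum $\sum_i (x_i^2 - 1)/2$. The function names are illustrative assumptions.

```python
import math

def info_content_gaussian(x):
    """Information content -log f(x) for the standard normal density on R^n."""
    n = len(x)
    return 0.5 * n * math.log(2.0 * math.pi) + 0.5 * sum(v * v for v in x)

def entropy_gaussian(n):
    """Entropy h(X) = (n/2) log(2*pi*e) of the standard normal law on R^n."""
    return 0.5 * n * math.log(2.0 * math.pi * math.e)

def centered_info_content(x):
    """Information content minus entropy, computed from the definitions."""
    return info_content_gaussian(x) - entropy_gaussian(len(x))

def sum_formula(x):
    """The same quantity written as the sum of the terms (x_i^2 - 1)/2."""
    return sum((v * v - 1.0) / 2.0 for v in x)

if __name__ == "__main__":
    point = [0.3, -1.2, 2.0]  # arbitrary illustrative point in R^3
    print(centered_info_content(point), sum_formula(point))
```

Each term $(X_i^2 - 1)/2$ has mean zero and variance $1/2$, so the standard deviation of the sum is $\sqrt{n/2}$, in line with the $\sqrt{n}$ scale in Theorem 1.1.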

When the dimension $n$ is large, the exponential decay in Theorem 1.1 may be improved to the Gaussian decay on the interval $0 < t < O(\sqrt{n})$.

Theorem 1.2. Given a random vector $X$ in $\mathbb{R}^n$ with log-concave density $f$,

$$ \mathbb{P}\Big\{\frac{1}{\sqrt{n}}|\log f(X) - \mathbb{E}\log f(X)| \geq t\Big\} \leq 3e^{-ct^2}, \qquad 0 \leq t \leq 2\sqrt{n}, $$

where $c > 0$ is a universal constant. In fact, one may take $c = 1/16$.
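To see how the two tail bounds compare, here is a small sketch that evaluates the right-hand sides of Theorems 1.1 and 1.2 with the admissible constant $c = 1/16$. The function names are hypothetical, and the code only tabulates the stated bounds; it proves nothing.

```python
import math

C = 1.0 / 16.0  # admissible constant quoted in Theorems 1.1 and 1.2

def exponential_tail(t):
    """Theorem 1.1 bound on P{|h~(X) - h(X)| >= t*sqrt(n)}, for t > 0."""
    return 2.0 * math.exp(-C * t)

def gaussian_tail(t, n):
    """Theorem 1.2 bound on the same event, stated only for 0 <= t <= 2*sqrt(n)."""
    if not 0.0 <= t <= 2.0 * math.sqrt(n):
        raise ValueError("Theorem 1.2 covers only 0 <= t <= 2*sqrt(n)")
    return 3.0 * math.exp(-C * t * t)

if __name__ == "__main__":
    n = 100
    for t in (1.0, 4.0, 8.0, 16.0):
        print(t, exponential_tail(t), gaussian_tail(t, n))
```

For small $t$ both bounds exceed 1 and carry no information; the Gaussian bound becomes the sharper one once $t$ is moderately large but still within the range $t \leq 2\sqrt{n}$.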

Substituting $t = s\sqrt{n}$, rewrite the above inequality as

(1.1) $$ \mathbb{P}\Big\{\Big|\frac{1}{n}\log\frac{1}{f(X)} - \frac{h(X)}{n}\Big| \geq s\Big\} \leq 3e^{-cs^2 n} $$

for $0 \leq s \leq 2$. Equivalently, in terms of the entropy power $N(X) = \exp\{-\frac{2}{n}\mathbb{E}\log f(X)\}$, we get for the value, say, $s = 1$,

$$ \mathbb{P}\{N(X)e^{-2} \leq f(X)^{-2/n} \leq N(X)e^{2}\} \geq 1 - 3e^{-n/16}. $$

Thus, with high probability, $f(X)^{-2/n}$ is very close to $N(X)$, and the distribution of $X$ itself is effectively the uniform distribution on the class of typical observables, or the "typical set" [defined to be the collection of all points $x \in \mathbb{R}^n$ such that $f(x)$ lies between $e^{-h(X) - n\varepsilon}$ and $e^{-h(X) + n\varepsilon}$ for some small fixed $\varepsilon > 0$].

A similar concentration inequality was obtained by Klartag and Milman [15], who compared the value $f(X)$ to the maximum $M$ of the density $f$ and proved that

$$ \mathbb{P}\{f(X) \geq c_0^n M\} \geq 1 - c_1^n $$

with some absolute constants $c_0, c_1 \in (0,1)$. Note this result readily follows from Theorem 1.2, but not conversely.

Theorems 1.1 and 1.2, by entailing an effective uniformity of the distribution of $X$ on some compact set, provide a strong, quantitative formulation of the asymptotic equipartition property for log-concave measures. To describe this interpretation, suppose $\mathbf{X} = (X_1, X_2, \ldots)$ is a stochastic process on the probability space $(\Omega, \mathcal{B}, \mathbb{P})$, with each $X_i$ taking values in $\mathbb{R}$, and define the corresponding projections $X^{(n)} = (X_1, \ldots, X_n)$. If $\mathbf{X}$ is stationary, the limit

$$ h(\mathbf{X}) = \lim_{n\to\infty} \frac{h(X^{(n)})}{n} $$

exists as long as the increments $h(X^{(n+1)}) - h(X^{(n)})$ are finite, and is called the entropy rate of $\mathbf{X}$. For stationary processes $\mathbf{X}$, the question of whether the information content per coordinate $\tilde h(X^{(n)})/n$ converges to the limit $h(\mathbf{X})$ (in $L^p$ or in probability or almost surely) has been extensively studied. In the discrete case, the affirmative answer to this question goes back to Shannon [21], McMillan [17] and Breiman [10], and the eponymous theorem has been called "the basic theorem of information theory." The continuous case was partially developed by Moy [18], Perez [20] and Kieffer [14]. The definitive version [almost sure convergence for stochastic processes defined on a standard Borel space, and allowing more general reference measures $\nu(dx)$ than Lebesgue and counting measure] is due independently to Barron [3] and Orey [19]; the former in particular gives a clear exposition and recounting of the history. Specifically, these works imply that if $\mathbf{X}$ is stationary and ergodic, then, as $n \to \infty$,

(1.2) $$ -\frac{1}{n}\log f(X^{(n)}) \to h(\mathbf{X}) \quad \text{a.s.} $$

An elementary proof of this fact, called by McMillan the "asymptotic equipartition property," was later given by Algoet and Cover [1]. For nonstationary processes with arbitrary dependence, the entropy rate $h(\mathbf{X})$ typically does not exist; so there is no question of a statement like (1.2) holding. Nonetheless, together with the Borel-Cantelli lemma, Theorem 1.1 immediately yields the following extension of the Shannon-McMillan-Breiman phenomenon.

Corollary 1.3. Suppose that $\mathbf{X}$ has a log-concave distribution on $\mathbb{R}^\infty$ with absolutely continuous finite-dimensional projections. If the limit $h(\mathbf{X})$ exists, the property (1.2) holds.
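The Borel-Cantelli step behind Corollary 1.3 can be spelled out as follows. This is a sketch of the standard argument rather than a quotation from the text; $\varepsilon > 0$ is an arbitrary fixed number introduced here for illustration, and $c$ is the constant of Theorem 1.1.

```latex
% Apply Theorem 1.1 to the projection X^{(n)} with t = \varepsilon\sqrt{n}:
\[
\mathbb{P}\Big\{ \Big| \frac{\tilde h(X^{(n)})}{n} - \frac{h(X^{(n)})}{n} \Big| \ge \varepsilon \Big\}
  = \mathbb{P}\big\{ |\tilde h(X^{(n)}) - h(X^{(n)})| \ge (\varepsilon\sqrt{n})\sqrt{n} \big\}
  \le 2 e^{-c\,\varepsilon\sqrt{n}} .
\]
% These bounds are summable in n:
\[
\sum_{n \ge 1} 2 e^{-c\,\varepsilon\sqrt{n}} < \infty ,
\]
% so, by the Borel-Cantelli lemma, almost surely only finitely many of the
% events on the left occur. Letting \varepsilon run over a sequence decreasing to 0,
\[
\frac{\tilde h(X^{(n)})}{n} - \frac{h(X^{(n)})}{n} \to 0 \quad \text{a.s.},
\]
% and if h(X^{(n)})/n \to h(\mathbf{X}), then (1.2) follows,
% since \tilde h(X^{(n)}) = -\log f(X^{(n)}).
```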

Note that log-concavity of a probability measure is defined on arbitrary locally convex spaces via a Brunn-Minkowski type inequality and is equivalent to the log-concavity of densities of finite-dimensional projections (in case they are absolutely continuous with respect to Lebesgue measure; see [9] for a general theory). Corollary 1.3 trivially extends to processes $\mathbf{X} = (X_1, X_2, \ldots)$ where each $X_i$ takes values in $\mathbb{R}^d$ instead of $\mathbb{R}$, as long as the finite-dimensional projections have log-concave distributions. This, for instance, means that Corollary 1.3 can be applied to nonstationary Markov chains in $\mathbb{R}^d$ that preserve log-concavity of the joint distribution and also have a unique invariant probability measure (the latter condition ensures existence of the entropy rate, which can also be easily computed as the mean under the invariant measure of the entropy of the conditional density of $X_2$ given $X_1$). Furthermore, if the process mixes well enough so that $h(X^{(n)})/n$ converges rapidly to $h(\mathbf{X})$, then Theorem 1.2 may be used to give a convergence rate in probability.

It should also be mentioned that, for Gaussian distributions, tight concentration inequalities may be derived by simple explicit calculation. This was done by Cover and Pombra [11] as an ingredient in studying the feedback capacity of time-varying additive Gaussian noise channels.

The paper is organized in the following way. As a first step, we consider a 
one-dimensional version of Theorem 1.1 (Section 2). In Section 3, we recall 
some previous work on reverse Lyapunov inequalities, and present a new 
variant. It is applied to establish a concentration property of the logarithm 
function under what we call log-concave measures of order p (Sections 4 
and 5). Section 6 uses the localization lemma of Lovasz and Simonovits to 
reduce the general case to a specific one-dimensional statement. Section 7 
completes the proof. 

2. One-dimensional case in Theorem 1.1. We begin by proving the one- 
dimensional case of Theorem 1.1. 

Proposition 2.1. If a random variable $X$ has a log-concave density $f$, then

$$ \mathbb{E}\,e^{(1/2)|\log f(X) - \mathbb{E}\log f(X)|} \leq 4. $$

Proof. Let $X$ be a random variable with log-concave density $f(x)$. The distribution of $X$ is supported on some interval $(a,b)$, finite or not, where $f$ is positive and $\log f$ is concave. Introduce the function

$$ I(t) = f(F^{-1}(t)), \qquad 0 < t < 1, $$

where $F^{-1}: (0,1) \to (a,b)$ is the inverse to the distribution function $F(x) = \mathbb{P}\{X \leq x\}$, $a < x < b$. The function $I$ is positive and concave on $(0,1)$ and uniquely determines $F$ up to a shift parameter ([4], Proposition A.1). Given a function $\Psi = \Psi(u,v)$, write a general identity

$$ \int_a^b\!\!\int_a^b \Psi(f(x), f(y))\,f(x)f(y)\,dx\,dy = \int_0^1\!\!\int_0^1 \Psi(I(t), I(s))\,dt\,ds. $$

In particular, for any $\alpha \in [0,1)$,

(2.1) $$ \int\!\!\int e^{\alpha|\log f(x) - \log f(y)|}\,dF(x)\,dF(y) = \int_0^1\!\!\int_0^1 e^{\alpha|\log I(t) - \log I(s)|}\,dt\,ds. $$

Here the right-hand side does not change when multiplying $I$ by a positive scalar, so one may assume that $I(1/2) = 1/2$. But then, by concavity of $I$, we have

$$ \min\{t, 1-t\} \leq I(t) \leq 1. $$

From this,

$$ \log I(t) - \log I(s) \leq -\log\min\{s, 1-s\}, $$
$$ \log I(s) - \log I(t) \leq -\log\min\{t, 1-t\}, $$

so

$$ |\log I(t) - \log I(s)| \leq -\log\min\{t, s, 1-t, 1-s\}. $$

Hence, the right-hand side of (2.1) does not exceed

$$ \int_0^1\!\!\int_0^1 e^{-\alpha\log\min\{t,s,1-t,1-s\}}\,dt\,ds = 4\int_0^{1/2}\!\!\int_0^{1/2} \min\{t,s\}^{-\alpha}\,dt\,ds = \frac{2^{1+\alpha}}{(1-\alpha)(2-\alpha)}. $$

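Finally, by Jensen's inequality with respect to $dF(y)$, the left-hand side of (2.1) majorizes

$$ \int e^{\alpha|\log f(x) - \int\log f(y)\,dF(y)|}\,dF(x) = \mathbb{E}\,e^{\alpha|\log f(X) - \mathbb{E}\log f(X)|}, $$

so that we have

(2.2) $$ \mathbb{E}\,e^{\alpha|\log f(X) - \mathbb{E}\log f(X)|} \leq \frac{2^{1+\alpha}}{(1-\alpha)(2-\alpha)}. $$

Choosing the value $\alpha = 1/2$, and observing that $\frac{8}{3}\sqrt{2} < 4$, we may conclude. □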
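As a sanity check of (2.2), one can take the standard exponential density $f(x) = e^{-x}$ on $(0,\infty)$, for which the left-hand side has a closed form. The sketch below is illustrative only; the choice of the exponential density and the function names are assumptions made here, not part of the proof.

```python
import math

def bound_2_2(alpha):
    """Right-hand side of (2.2): 2^(1+alpha) / ((1-alpha)(2-alpha)), 0 <= alpha < 1."""
    return 2.0 ** (1.0 + alpha) / ((1.0 - alpha) * (2.0 - alpha))

def exponential_mgf(alpha):
    """E exp(alpha*|log f(X) - E log f(X)|) for f(x) = exp(-x), x > 0.

    Here log f(X) = -X and E log f(X) = -1, so the quantity equals
    E exp(alpha*|X - 1|); the integral is split at x = 1.
    """
    left = math.exp(alpha) * (1.0 - math.exp(-(1.0 + alpha))) / (1.0 + alpha)
    right = math.exp(-1.0) / (1.0 - alpha)
    return left + right

if __name__ == "__main__":
    for alpha in (0.0, 0.25, 0.5, 0.9):
        print(alpha, exponential_mgf(alpha), bound_2_2(alpha))
```

At $\alpha = 1/2$ the bound equals $\frac{8}{3}\sqrt{2}$, which is the constant rounded up to 4 in Proposition 2.1.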

Note also that a direct application of Chebyshev's inequality yields

$$ \mathbb{P}\{|\log f(X) - \mathbb{E}\log f(X)| \geq t\} \leq 4e^{-t/2} $$

for all $t > 0$. While the exponent here is slightly better than that in Theorem 1.1, we make no effort here (or anywhere in this paper) to come up with optimal constants.

3. Reverse Lyapunov inequalities. Given a random variable $\eta > 0$, the Lyapunov inequality states that

(3.1) $$ \lambda_a^{b-c}\lambda_c^{a-b} \geq \lambda_b^{a-c}, \qquad a \geq b \geq c \geq 0, $$

where $\lambda_p = \mathbb{E}\eta^p$ is the moment function of $\eta$. Equivalently, it expresses a well-known and obvious property that the function $p \to \log\lambda_p$ is convex on the positive half-axis $p \geq 0$.

What is less obvious, for certain classes of probability distributions on $(0,+\infty)$, the inequality (3.1) may be reversed after a suitable normalization of the moment function. In particular, when $\eta$ has a distribution with increasing hazard rate (for example, if $\eta$ has a log-concave density), then as was shown by Barlow, Marshall and Proschan ([2], page 384), we have

(3.2) $$ \bar\lambda_a^{b-c}\bar\lambda_c^{a-b} \leq \bar\lambda_b^{a-c}, \qquad a \geq b \geq c \geq 1 \ \ (c \text{ is integer}), $$

for the normalized moment function

$$ \bar\lambda_p = \frac{1}{\Gamma(p+1)}\lambda_p. $$

Note that $\bar\lambda_p = 1$ for all $p \geq 0$ for the standard exponential distribution, which thus plays an extremal role in this class.

This result has many interesting applications. For example, applying it to the parameters $a = p+1$, $b = p$, $c = p-1$, we have

(3.3) $$ \mathbb{E}\eta^{p-1}\,\mathbb{E}\eta^{p+1} \leq \Big(1 + \frac{1}{p}\Big)(\mathbb{E}\eta^p)^2, $$

provided that $p \geq 2$ is integer. If the distribution of $\eta$ is log-concave, the case $p = 1$ can also be included in this inequality, which is due to a Khinchine-type inequality by Karlin, Proschan and Barlow [13], namely,

$$ \mathbb{E}\eta^p \leq \Gamma(p+1)(\mathbb{E}\eta)^p, \qquad p \geq 1. $$

However, in some problems, it is desirable to remove the requirement that $c$ is integer in (3.2). This is implied by results of Borell [8] for the class of log-concave densities. To be more precise, he proved the following (Theorem 2 in [8]).

Proposition 3.1. Let $\eta$ be a nonnegative concave function, defined on an open convex body $\Omega \subset \mathbb{R}^n$. Then the function

$$ p \to \frac{(p+1)\cdots(p+n)}{n!}\int_\Omega \eta(x)^p\,dx $$

is log-concave in $p > 0$.

To relate this to (3.2), let us start with a continuous convex function $u: \Delta \to \mathbb{R}$, defined on some closed segment $\Delta \subset (0,+\infty)$, such that $e^{-u(x)}$ is a probability density. For large $n$, consider convex bodies

$$ \Omega_n = \Big\{(x_1, \ldots, x_n, x) \in \mathbb{R}^n_+ \times \Delta : x_1 + \cdots + x_n \leq 1 - \frac{u(x)}{n}\Big\}. $$

Their volumes satisfy, as $n \to \infty$,

(3.4) $$ n!\,|\Omega_n| = \int_\Delta \Big(1 - \frac{u(x)}{n}\Big)^n dx \to \int_\Delta e^{-u(x)}\,dx = 1, $$

and for every $p > 0$,

(3.5) $$ V_n(p) = n!\int_{\Omega_n} x^p\,dx_1\cdots dx_n\,dx \to v(p) = \int_\Delta x^p e^{-u(x)}\,dx. $$

By Proposition 3.1, applied to $\eta(x_1, \ldots, x_n, x) = x$ on the $(n+1)$-dimensional body $\Omega_n$, the functions

$$ W_n(p) = \frac{(p+1)\cdots(p+n+1)}{n^{p+1}\,n!}\,V_n(p), \qquad p > 0, $$

are log-concave, so the limit will also be a log-concave function, if it exists. (Note that we have added a log-linear factor $n^{-(p+1)}$.) But

$$ \frac{(p+1)\cdots(p+n+1)}{n^{p+1}\,n!} \to \frac{1}{\Gamma(p+1)}. $$

Therefore, in view of (3.4) and (3.5), the resulting limit $\frac{1}{\Gamma(p+1)}v(p)$ represents a log-concave function, as well.

At this step, the assumption that $u$ is defined on a closed segment can be relaxed, and we arrive at the following corollary (which seems not to be mentioned in [8] or anywhere else).

Corollary 3.2. If a random variable $\eta > 0$ has a log-concave distribution, then the function

$$ \bar\lambda_p = \frac{1}{\Gamma(p+1)}\mathbb{E}\eta^p, \qquad p \geq 0, $$

is log-concave. Equivalently, we have a reverse Lyapunov's inequality

(3.6) $$ \bar\lambda_a^{b-c}\bar\lambda_c^{a-b} \leq \bar\lambda_b^{a-c}, \qquad a \geq b \geq c \geq 0. $$
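The reverse Lyapunov inequality (3.6) can be spot-checked on two log-concave laws whose moments are explicit: the standard exponential law, where $\mathbb{E}\eta^p = \Gamma(p+1)$, and the uniform law on $(0,1)$, where $\mathbb{E}\eta^p = 1/(p+1)$. The sketch below works on the logarithmic scale; the choice of test distributions and all function names are assumptions made for illustration.

```python
import math

def log_norm_moment_exponential(p):
    """log of E eta^p / Gamma(p+1) for the standard exponential law.

    Since E eta^p = Gamma(p+1), the normalized moment is identically 1.
    """
    return 0.0

def log_norm_moment_uniform(p):
    """log of E eta^p / Gamma(p+1) for eta uniform on (0,1), E eta^p = 1/(p+1)."""
    return -math.log(p + 1.0) - math.lgamma(p + 1.0)

def reverse_lyapunov_gap(log_lam, a, b, c):
    """(a-c)*log lam_b - (b-c)*log lam_a - (a-b)*log lam_c.

    This is nonnegative exactly when (3.6) holds for the triple a >= b >= c >= 0.
    """
    return (a - c) * log_lam(b) - (b - c) * log_lam(a) - (a - b) * log_lam(c)

if __name__ == "__main__":
    triples = [(2.5, 1.2, 0.3), (4.0, 3.0, 0.0), (10.0, 5.5, 5.0)]
    for a, b, c in triples:
        print((a, b, c),
              reverse_lyapunov_gap(log_norm_moment_exponential, a, b, c),
              reverse_lyapunov_gap(log_norm_moment_uniform, a, b, c))
```

The gap vanishes for the exponential law, which reflects its extremal role, and is strictly positive for the uniform law whenever $a > b > c$.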

In connection with the concentration problem and the Kannan-Lovasz-Simonovits conjecture within the class of spherically symmetric distributions on $\mathbb{R}^n$, reverse Lyapunov's inequalities were considered in [5]. The following alternative variant of Corollary 3.2 is proposed there.

Proposition 3.3. Given a random variable $\eta > 0$ with a log-concave distribution, the function $\tilde\lambda_p = \mathbb{E}(\eta/p)^p$ is log-concave in $p > 0$, and therefore satisfies (3.6) with $\tilde\lambda$ in place of $\bar\lambda$.

This is proved in [5] by an application of the Prekopa-Leindler inequality, and is perhaps more convenient for applications involving asymptotics.

There is much more that can be (and has been) said about reverse Lyapunov inequalities; a gentle introduction may be found in [7].

4. Log-concave distributions of order p. 

Definition 4.1. A random variable $\xi > 0$ will be said to have a log-concave distribution of order $p \geq 1$, if it has a density of the form

$$ f(x) = x^{p-1}g(x), \qquad x > 0, $$

where the function $g$ is log-concave on $(0,+\infty)$.

When $p = 1$, we obtain the class of all (nondegenerate) log-concave probability distributions on $(0,+\infty)$.

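The meaning of the parameter $p$ is that it is responsible for a strengthened concentration. For example, the inequality (3.3), which holds by Corollary 3.2 for all real $p \geq 1$, may equivalently be rewritten in terms of $\xi$ as

(4.1) $$ \mathrm{Var}(\xi) \leq \frac{1}{p}(\mathbb{E}\xi)^2. $$

Here $\eta$ denotes a random variable with density proportional to $g$, so that $\mathbb{E}\xi^k = \mathbb{E}\eta^{p-1+k}/\mathbb{E}\eta^{p-1}$. Alternatively, if we start with Proposition 3.3 and apply (3.6) with $a = p+1$, $b = p$, $c = p-1$ ($p > 1$), we get $\mathbb{E}\eta^{p+1}\,\mathbb{E}\eta^{p-1} \leq C_p(\mathbb{E}\eta^p)^2$ with constants $C_p = (p+1)^{p+1}(p-1)^{p-1}p^{-2p}$. Equivalently,

(4.2) $$ \mathrm{Var}(\xi) \leq (C_p - 1)(\mathbb{E}\xi)^2 $$

in the class of log-concave $\xi$ of order $p$. Asymptotically $C_p = 1 + \frac{1}{p} + O(\frac{1}{p^2})$ as $p \to +\infty$, so the bound (4.2) is very close to (4.1) for large values of $p$.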

Example 4.2. Let $\xi$ have a Gamma distribution with shape parameter $p$ (where $p > 0$ is real), that is, with density

$$ f(x) = \frac{1}{\Gamma(p)}x^{p-1}e^{-x}, \qquad x > 0. $$

It is log-concave if and only if $p \geq 1$, in which case $p$ will be the order of log-concavity for this distribution. Note that $\mathbb{E}\xi = \mathrm{Var}(\xi) = p$, and (4.1) becomes equality. Hence, the factor $1/p$ in (4.1) is optimal.
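The Gamma example lends itself to a direct numerical check of (4.1) and (4.2). The following sketch uses the moment formula $\mathbb{E}\xi^k = \Gamma(p+k)/\Gamma(p)$ for the Gamma law with shape $p$; the function names are illustrative assumptions.

```python
import math

def gamma_moment(p, k):
    """E xi^k for the Gamma distribution with shape p: Gamma(p+k)/Gamma(p)."""
    return math.exp(math.lgamma(p + k) - math.lgamma(p))

def variance_ratio(p):
    """Var(xi) / (E xi)^2 for the Gamma distribution with shape p."""
    m1 = gamma_moment(p, 1)
    m2 = gamma_moment(p, 2)
    return (m2 - m1 * m1) / (m1 * m1)

def c_p(p):
    """The constant C_p = (p+1)^(p+1) (p-1)^(p-1) p^(-2p) from (4.2), for p > 1."""
    log_c = (p + 1) * math.log(p + 1) + (p - 1) * math.log(p - 1) - 2 * p * math.log(p)
    return math.exp(log_c)

if __name__ == "__main__":
    for p in (1.5, 3.0, 10.0, 100.0):
        # variance ratio, the bound 1/p from (4.1), and the bound C_p - 1 from (4.2)
        print(p, variance_ratio(p), 1.0 / p, c_p(p) - 1.0)
```

The printed ratio coincides with $1/p$ up to rounding, while $C_p - 1$ is slightly larger and approaches $1/p$ as $p$ grows.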

Proposition 4.3. If $\xi > 0$ has a log-concave distribution of order $p \geq 1$, then

$$ \mathrm{Var}(\log\xi) \leq \frac{d^2}{dp^2}\log\Gamma(p). $$

Equality is attained at the Gamma distribution with shape parameter $p$.

Proof. Write the density of $\xi$ as $f(x) = x^{p-1}g(x)$ with log-concave $g$. One may assume that $g$ is a density, as well. Indeed, otherwise consider random variables $\xi_c = c\xi$ ($c > 0$). Then $\mathrm{Var}(\log\xi_c) = \mathrm{Var}(\log\xi)$ and $\xi_c$ has density

$$ f_c(x) = c^{-p}x^{p-1}g(x/c) = x^{p-1}g_c(x), $$

where $g_c(x) = c^{-p}g(x/c)$. Since $f$ decays at infinity exponentially fast, the same is true for $g$. Hence, $g$ is integrable, and one can choose $c$ such that $\int g_c(x)\,dx = 1$. So the reduction to the case where $g$ is a density is achieved.

Thus, let $g$ be a log-concave probability density, such that $f(x) = x^{p-1}g(x)$ is the density of $\xi$. Consider a random variable $\eta > 0$ with density $g$. Then, by Corollary 3.2, the function

$$ u(q) = \log\mathbb{E}\eta^{q-1} - \log\Gamma(q), \qquad q > 0, $$

is concave. Differentiating twice with respect to $q$, we get

$$ u''(q) = \frac{\mathbb{E}\eta^{q-1}\log^2\eta\;\mathbb{E}\eta^{q-1} - (\mathbb{E}\eta^{q-1}\log\eta)^2}{(\mathbb{E}\eta^{q-1})^2} - \frac{d^2}{dq^2}\log\Gamma(q). $$

But at the point $q = p$, we have

$$ \mathbb{E}\eta^{p-1} = \int x^{p-1}g(x)\,dx = \int f(x)\,dx = 1, $$

and so

$$ u''(p) + \frac{d^2}{dp^2}\log\Gamma(p) = \mathbb{E}\eta^{p-1}\log^2\eta - (\mathbb{E}\eta^{p-1}\log\eta)^2 = \int x^{p-1}\log^2 x\,g(x)\,dx - \Big(\int x^{p-1}\log x\,g(x)\,dx\Big)^2 = \mathrm{Var}(\log\xi). $$

Since $u''(p) \leq 0$ by concavity, Proposition 4.3 is proved. □

It is to be noted that the right-hand side in Proposition 4.3 is the trigamma function, which has the alternate representation

$$ \frac{d^2}{dp^2}\log\Gamma(p) = \sum_{n=0}^{\infty}\frac{1}{(p+n)^2} $$

and behaves like $1/p$ for large values of $p$. Hence,

(4.3) $$ \mathrm{Var}(\log\xi) \leq \frac{C}{p} $$
with some absolute constant $C$ (in fact, since the trigamma function does not exceed $1/p + 1/p^2$, one may take $C = 2$ for $p \geq 1$). This can also be seen by using Proposition 3.3. Indeed, the same argument as above yields

(4.4) $$ \mathrm{Var}(\log\xi) \leq \frac{d^2}{dp^2}\,(p-1)\log(p-1) = \frac{1}{p-1}, $$

which holds for any $p > 1$. Here the right-hand side has an incorrect behavior when $p$ is close to 1. In fact, for all log-concave $\xi$, we have

(4.5) $$ \mathrm{Var}(\log\xi) \leq C $$

with some absolute constant $C$. For the proof, one can apply, for example, Borell's concentration lemma ([9], Lemma 3.1). Together with (4.5), (4.4) also yields the bound (4.3).
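The size of the trigamma function in Proposition 4.3 can be examined numerically. The sketch below sums the series $\sum_{n \geq 0} (p+n)^{-2}$ and compares the result with $1/p$ and $1/p + 1/p^2$; the truncation scheme (a finite head plus the leading terms of the asymptotic expansion for the tail) and the function name are assumptions of this illustration.

```python
import math

def trigamma(p, terms=1000):
    """Trigamma function via the series sum_{n>=0} 1/(p+n)^2, for p > 0.

    The first `terms` summands are added directly; the remaining tail, which is
    the trigamma function at x = p + terms, is approximated by the leading
    asymptotic terms 1/x + 1/(2x^2) + 1/(6x^3).
    """
    head = sum(1.0 / (p + n) ** 2 for n in range(terms))
    x = p + terms
    tail = 1.0 / x + 1.0 / (2.0 * x * x) + 1.0 / (6.0 * x ** 3)
    return head + tail

if __name__ == "__main__":
    for p in (1.0, 2.0, 5.0, 50.0):
        print(p, 1.0 / p, trigamma(p), 1.0 / p + 1.0 / p ** 2)
```

At $p = 1$ the value is $\pi^2/6$, and for every $p$ tried the trigamma function lies strictly between $1/p$ and $1/p + 1/p^2$.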

In the proof of Theorem 1.1, we use the values $p = n$, the dimension of the space. Since the one-dimensional case can be treated separately (rather easily), the assumption $p \geq 2$ can be made in applications.

Remark 4.4. The notion of a log-concave measure of order $p$ may be extended in a natural way to the class of one-dimensional log-concave probability measures $\mu$ on $\mathbb{R}^n$. More precisely, we say that $\mu$ has order $p$, if $\mu$ is supported on some interval $\Delta \subset \mathbb{R}^n$, bounded or not, and has a density there of the form

$$ \frac{d\mu(x)}{dx} = \ell(x)^{p-1}g(x), $$

where $\ell$ is a positive affine function on $\Delta$, $g$ is log-concave on $\Delta$, and where $dx$ stands for the Lebesgue measure on this interval. In this case, the inequality (4.3) and other similar results should be properly read in terms of $\ell$. For example, we have $\mathrm{Var}(\log\ell) \leq \frac{C}{p}$ with respect to $\mu$.

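5. Concentration of the logarithm function. It is natural to try to sharpen Proposition 4.3 and the resulting asymptotic bound (4.3) in terms of deviations of $\log\xi$ from its mean or quantiles.

Let $\xi > 0$ be a random variable with log-concave distribution of order $p+1$, that is, with density of the form

$$ f(x) = x^p g(x), \qquad x > 0, $$

where $p \geq 0$ and $g$ is a log-concave function. Let $\zeta$ be an independent copy of $\xi$. Then for all $\alpha \in [0,p]$,

$$ \mathbb{E}\,e^{\alpha|\log\xi - \log\zeta|} \leq 2\,\mathbb{E}\,\xi^{\alpha}\zeta^{-\alpha} = 2\int x^{p+\alpha}g(x)\,dx \int x^{p-\alpha}g(x)\,dx. $$

The quantity $\mathbb{E}\,e^{\alpha|\log\xi - \log\zeta|}$ does not change if we multiply $\xi$ and $\zeta$ by a positive scalar. Hence, as in the proof of Proposition 4.3, we may assume that $g$ is a probability density of some random variable, say, $\eta$. Applying Jensen's inequality, we thus conclude that

(5.1) $$ \mathbb{E}\,e^{\alpha|\log\xi - \mathbb{E}\log\xi|} \leq 2\,\mathbb{E}\eta^{p+\alpha}\,\mathbb{E}\eta^{p-\alpha}, \qquad 0 \leq \alpha \leq p, $$

provided that $\mathbb{E}\eta^p = 1$ (which means that $f$ is a density). But by the reverse Lyapunov's inequality of Corollary 3.2, applied with $a = p+\alpha$, $b = p$, $c = p-\alpha$, we obtain that

$$ \mathbb{E}\eta^{p+\alpha}\,\mathbb{E}\eta^{p-\alpha} \leq \frac{\Gamma(p+\alpha+1)\,\Gamma(p-\alpha+1)}{\Gamma(p+1)^2}. $$

Note that when $\alpha = 1$, this inequality returns us to inequality (4.1). Thus, from (5.1),

$$ \mathbb{E}\,e^{\alpha|\log\xi - \mathbb{E}\log\xi|} \leq \frac{2\,\Gamma(p+\alpha+1)\,\Gamma(p-\alpha+1)}{\Gamma(p+1)^2}, \qquad 0 \leq \alpha \leq p. $$

The right-hand side here seems perhaps not quite convenient to deal with, especially when $p \pm \alpha$ are not integer. Alternatively, it might be better to use Proposition 3.3, which gives

$$ \mathbb{E}\,e^{\alpha|\log\xi - \mathbb{E}\log\xi|} \leq 2\,\frac{(p+\alpha)^{p+\alpha}(p-\alpha)^{p-\alpha}}{p^{2p}}, \qquad 0 \leq \alpha \leq p. $$

Indeed, write

$$ \frac{(p+\alpha)^{p+\alpha}(p-\alpha)^{p-\alpha}}{p^{2p}} = \Big(1 - \frac{\alpha^2}{p^2}\Big)^{p-\alpha}\Big(1 + \frac{\alpha}{p}\Big)^{2\alpha}. $$

The first factor on the right may be bounded just by 1. For the second one, using $(1+t)^{1/t} < e$ ($t > 0$), one has

$$ \Big(1 + \frac{\alpha}{p}\Big)^{2\alpha} = \Big[\Big(1 + \frac{\alpha}{p}\Big)^{p/\alpha}\Big]^{2\alpha^2/p} \leq e^{2\alpha^2/p}. $$

Therefore, we have a preliminary Gaussian estimate:

$$ \mathbb{E}\,e^{\alpha|\log\xi - \mathbb{E}\log\xi|} \leq 2e^{2\alpha^2/p}, \qquad 0 \leq \alpha \leq p. $$

Similarly, one may also obtain a one-sided estimate, since like in inequality (5.1) we also have

$$ \mathbb{E}\,e^{\alpha(\log\xi - \log\zeta)} = \mathbb{E}\eta^{p+\alpha}\,\mathbb{E}\eta^{p-\alpha}, \qquad 0 \leq |\alpha| \leq p, $$

provided that $\mathbb{E}\eta^p = 1$. These estimates are collected below after replacing $p$ by $p-1$ for convenience.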

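Lemma 5.1. If $\xi > 0$ has a log-concave distribution of order $p > 1$, then

(5.2) $$ \mathbb{E}\,e^{\alpha|\log\xi - \mathbb{E}\log\xi|} \leq 2e^{2\alpha^2/(p-1)}, \qquad 0 \leq \alpha \leq p-1, $$

(5.3) $$ \mathbb{E}\,e^{\alpha(\log\xi - \mathbb{E}\log\xi)} \leq e^{2\alpha^2/(p-1)}, \qquad 0 \leq |\alpha| \leq p-1. $$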
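The elementary estimate that produces the Gaussian exponent in Lemma 5.1, namely $(p+\alpha)^{p+\alpha}(p-\alpha)^{p-\alpha}p^{-2p} \leq e^{2\alpha^2/p}$ for $0 \leq \alpha \leq p$ (stated before the shift from $p$ to $p-1$), can be spot-checked on a grid. This is a minimal sketch; the function names and the grid are assumptions made for illustration.

```python
import math

def log_moment_product_bound(p, alpha):
    """log of (p+alpha)^(p+alpha) (p-alpha)^(p-alpha) / p^(2p), for 0 <= alpha <= p."""
    def xlogx(x):
        # convention 0^0 = 1, i.e. 0 * log(0) = 0
        return 0.0 if x == 0.0 else x * math.log(x)
    return xlogx(p + alpha) + xlogx(p - alpha) - 2.0 * p * math.log(p)

def gaussian_exponent(p, alpha):
    """The exponent 2*alpha^2/p that dominates the quantity above."""
    return 2.0 * alpha * alpha / p

if __name__ == "__main__":
    for p in (1.0, 2.5, 10.0):
        for frac in (0.0, 0.25, 0.5, 1.0):
            alpha = frac * p
            print(p, alpha, log_moment_product_bound(p, alpha), gaussian_exponent(p, alpha))
```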

In particular, we obtain for log-concave densities of order $p$ on the positive half-line a $p$-dependent version of Proposition 2.1 (which was stated for log-concave densities on the line).

Corollary 5.2. If $\xi > 0$ has a log-concave distribution of order $p \geq 1$, then

$$ \mathbb{E}\,e^{(1/6)\sqrt{p}\,|\log\xi - \mathbb{E}\log\xi|} \leq 3. $$
Proof. First, assume $p \geq 2$ and choose $\alpha = c\sqrt{p}$ in (5.2) with $0 < c \leq 1/\sqrt{2}$ (so that $\alpha \leq p-1$). Then, using $p/(p-1) \leq 2$, we have

$$ \mathbb{E}\,e^{c\sqrt{p}\,|\log\xi - \mathbb{E}\log\xi|} \leq 2e^{4c^2}. $$

Taking, for example, $c = 1/6$, the right-hand side will not exceed $2e^{1/9} < 3$. Hence,

$$ \mathbb{E}\,e^{(1/6)\sqrt{p}\,|\log\xi - \mathbb{E}\log\xi|} \leq 3. $$

For the remaining range $1 \leq p \leq 2$, one has $\frac{1}{6}\sqrt{p} \leq 1/4$, and we have by Proposition 2.1 [or more precisely, inequality (2.2)] that

$$ \mathbb{E}\,e^{(1/6)\sqrt{p}\,|\log\xi - \mathbb{E}\log\xi|} \leq \mathbb{E}\,e^{(1/4)|\log\xi - \mathbb{E}\log\xi|} \leq \frac{2^{5/4}}{3/4 \times 7/4} < 3. $$

Thus, the desired statement is proved with a uniform bound of 3. □

Observe that Proposition 2.1 corresponds to $p = 1$, and that while it clearly applies as stated to log-concave densities of order $p$ (since these are subclasses of the log-concave densities), Corollary 5.2 with the additional $\sqrt{p}$ term in the exponent provides the correct generalization for large $p$.

6. Reduction to dimension one. To reduce Theorems 1.1 and 1.2 to a specific statement about dimension one (in fact, about log-concave distributions of order $p = n$), we apply a localization argument of Lovasz and Simonovits [16]. More precisely, we need one variant of the localization lemma, proposed in [12], Corollary 2.4, which we state with minor modification as a lemma.

Lemma 6.1. Let $g$ and $h$ be integrable continuous functions on a bounded open convex set $\Omega$ in $\mathbb{R}^n$, such that

$$ \int_\Omega g(x)\,dx > 0, \qquad \int_\Omega h(x)\,dx = 0. $$

Then for some interval $\Delta \subset \Omega$, and a positive affine function $\ell$ on $\Delta$,

$$ \int_\Delta g\,\ell^{n-1} > 0, \qquad \int_\Delta h\,\ell^{n-1} = 0, $$

where the integrals are with respect to Lebesgue measure on $\Delta$.

Equivalently, given that $\int_\Omega h(x)\,dx = 0$, if for all couples $(\Delta, \ell)$ with $\int_\Delta h\,\ell^{n-1} = 0$, we have that

$$ \int_\Delta g\,\ell^{n-1} \leq 0, $$

then

$$ \int_\Omega g(x)\,dx \leq 0. $$

This formulation enables the desired dimensional reduction.

Lemma 6.2. Suppose $X$ is a random vector taking values in an open convex set $\Omega$ in $\mathbb{R}^n$, where it has a positive continuous density $f$, such that $\mathbb{E}|\log f(X)|$ is finite. Let $\mu_\ell$ denote a probability measure on a line segment $\Delta \subset \Omega$ with density

$$ f_\ell(x) = \frac{1}{Z}\,f(x)\,\ell(x)^{n-1}, $$

where $\ell$ is a positive affine function, defined on $\Delta$, and $Z = \int_\Delta f(x)\ell(x)^{n-1}\,dx$ is a normalizing constant. Given $\alpha > 0$ and $A > 1$, if for any such one-dimensional measure $\mu_\ell$, we have

(6.1) $$ \mathbb{E}_\ell\,e^{(\alpha/\sqrt{n})|\log f - \mathbb{E}_\ell\log f|} \leq A, $$

where $\mathbb{E}_\ell$ stands for the expectation with respect to $\mu_\ell$, then

(6.2) $$ \mathbb{E}\exp\Big\{\frac{\alpha}{\sqrt{n}}|\log f(X) - \mathbb{E}\log f(X)|\Big\} \leq A. $$


Proof. Without loss of generality, take $\Omega$ to be bounded, and assume that $\mathbb{E}\log f(X) = 0$, or in other words,

(6.3) $$ \int_\Omega \log f(x)\,f(x)\,dx = 0. $$

In this case, (6.2) becomes

(6.4) $$ \int_\Omega \big(e^{(\alpha/\sqrt{n})|\log f(x)|} - A\big)f(x)\,dx \leq 0. $$

This corresponds to Lemma 6.1 with

$$ h(x) = \log f(x)\,f(x) \quad\text{and}\quad g(x) = \big(e^{(\alpha/\sqrt{n})|\log f(x)|} - A\big)f(x). $$

Hence, to derive (6.4) under (6.3), it suffices to take an arbitrary interval $\Delta \subset \Omega$ and a positive affine function $\ell$ on $\Delta$, such that

(6.5) $$ \int_\Delta \log f(x)\,f(x)\,\ell(x)^{n-1}\,dx = 0, $$

and to show that

(6.6) $$ \int_\Delta \big(e^{(\alpha/\sqrt{n})|\log f(x)|} - A\big)f(x)\,\ell(x)^{n-1}\,dx \leq 0. $$

Using the definition of $\mu_\ell$, inequalities (6.5) and (6.6) take the form

$$ \int \log f\,d\mu_\ell = 0, \qquad \int \big(e^{(\alpha/\sqrt{n})|\log f|} - A\big)\,d\mu_\ell \leq 0, $$

which can be written together as (6.1). □

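7. Proofs of Theorems 1.1 and 1.2. Keeping the same notation as in the previous section, first note that

$$ \log f - \mathbb{E}_\ell\log f = (\log f_\ell - \mathbb{E}_\ell\log f_\ell) - (n-1)(\log\ell - \mathbb{E}_\ell\log\ell), $$

so

$$ |\log f - \mathbb{E}_\ell\log f| \leq |\log f_\ell - \mathbb{E}_\ell\log f_\ell| + (n-1)|\log\ell - \mathbb{E}_\ell\log\ell|. $$

By convexity of the functional $\xi \to \log\mathbb{E}e^{\xi}$, we have that

(7.1) $$ \log\mathbb{E}_\ell\,e^{(\alpha/(2\sqrt{n}))|\log f - \mathbb{E}_\ell\log f|} \leq \frac{1}{2}\log\mathbb{E}_\ell\,e^{(\alpha/\sqrt{n})|\log f_\ell - \mathbb{E}_\ell\log f_\ell|} + \frac{1}{2}\log\mathbb{E}_\ell\,e^{(\alpha(n-1)/\sqrt{n})|\log\ell - \mathbb{E}_\ell\log\ell|}. $$

Since $f_\ell$ is the density of the one-dimensional log-concave probability measure $\mu_\ell$, by Proposition 2.1, whenever $0 \leq \alpha \leq \frac{1}{2}\sqrt{n}$,

(7.2) $$ \mathbb{E}_\ell\,e^{(\alpha/\sqrt{n})|\log f_\ell - \mathbb{E}_\ell\log f_\ell|} \leq 4. $$

To estimate the second expectation in (7.1), it is useful to note that $\mu_\ell$ has order $p = n$ (cf. Remark 4.4). If $n = 1$, this expectation is just 1. If $n \geq 2$, by the inequality (5.2) of Lemma 5.1, we have

(7.3) $$ \mathbb{E}_\ell\,e^{(\alpha(n-1)/\sqrt{n})|\log\ell - \mathbb{E}_\ell\log\ell|} \leq 2e^{2\alpha^2(n-1)/n} < 2e^{2\alpha^2}, $$

provided that $0 \leq \alpha \leq \sqrt{n}$. This bound automatically holds for $n = 1$, as well.

Collecting the bounds (7.2) and (7.3) in (7.1), we get that, for all $0 \leq \alpha \leq \frac{1}{2}\sqrt{n}$,

$$ \log\mathbb{E}_\ell\,e^{(\alpha/(2\sqrt{n}))|\log f - \mathbb{E}_\ell\log f|} \leq \frac{1}{2}\log\big(8e^{2\alpha^2}\big). $$

Hence, using $\sqrt{8} < 3$ (to simplify the constant),

$$ \mathbb{E}_\ell\,e^{(\alpha/(2\sqrt{n}))|\log f - \mathbb{E}_\ell\log f|} \leq 3e^{\alpha^2}, \qquad 0 \leq \alpha \leq \tfrac{1}{2}\sqrt{n}. $$

Now, replace $\alpha$ with $2\alpha$. We then get that

$$ \mathbb{E}_\ell\,e^{(\alpha/\sqrt{n})|\log f - \mathbb{E}_\ell\log f|} \leq 3e^{4\alpha^2}, \qquad 0 \leq \alpha \leq \tfrac{1}{4}\sqrt{n}. $$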

Recalling Lemma 6.2 (whose assumptions hold for all log-concave densities), we arrive at the following theorem.

Theorem 7.1. Given a random vector $X$ in $\mathbb{R}^n$ with log-concave density $f(x)$,

$$ \mathbb{E}\exp\Big\{\frac{\alpha}{\sqrt{n}}|\log f(X) - \mathbb{E}\log f(X)|\Big\} \leq 3e^{4\alpha^2}, \qquad 0 \leq \alpha \leq \tfrac{1}{4}\sqrt{n}. $$

Choose $\alpha = 1/4$. Denoting $\xi = \frac{1}{4\sqrt{n}}|\log f(X) - \mathbb{E}\log f(X)|$, we have $\mathbb{E}e^{\xi} \leq 3e^{1/4}$. Hence, $\mathbb{E}e^{\xi/4} \leq 3^{1/4}e^{1/16} < 2$. This gives the following.

Corollary 7.2. Given a random vector $X$ in $\mathbb{R}^n$ with log-concave density $f(x)$,

$$ \mathbb{E}\exp\Big\{\frac{1}{16\sqrt{n}}|\log f(X) - \mathbb{E}\log f(X)|\Big\} \leq 2. $$

By applying Chebyshev's inequality, we arrive at Theorem 1.1 with $c = 1/16$. From Theorem 7.1, by Chebyshev's inequality, we also have

$$ \mathbb{P}\Big\{\frac{1}{\sqrt{n}}|\log f(X) - \mathbb{E}\log f(X)| \geq t\Big\} \leq 3e^{4\alpha^2 - \alpha t}, \qquad t \geq 0, $$

provided that $0 \leq \alpha \leq \frac{1}{4}\sqrt{n}$. Taking the optimal value $\alpha = t/8$ gives Theorem 1.2.
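The numerical constants in this last step are easy to verify. The sketch below checks the optimization over $\alpha$ that leads from Theorem 7.1 to the exponent $-t^2/16$ of Theorem 1.2, together with the two simplifications $\sqrt{8} < 3$ and $3^{1/4}e^{1/16} < 2$; the function names are illustrative assumptions.

```python
import math

def chernoff_exponent(alpha, t):
    """Exponent 4*alpha^2 - alpha*t obtained from Theorem 7.1 and Chebyshev's inequality."""
    return 4.0 * alpha * alpha - alpha * t

def optimal_alpha(t):
    """Minimizer of alpha -> 4*alpha^2 - alpha*t over the real line."""
    return t / 8.0

if __name__ == "__main__":
    n = 9
    t = 2.0 * math.sqrt(n)        # largest t allowed in Theorem 1.2
    alpha = optimal_alpha(t)      # equals sqrt(n)/4, the largest admissible alpha
    print(alpha, math.sqrt(n) / 4.0)
    print(chernoff_exponent(alpha, t), -t * t / 16.0)
    print(math.sqrt(8.0), 3.0 ** 0.25 * math.exp(1.0 / 16.0))
```

The constraint $\alpha \leq \frac{1}{4}\sqrt{n}$ is what restricts Theorem 1.2 to the range $t \leq 2\sqrt{n}$.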